Draft

OGC Engineering Report

July 2024 Open Standards Code Sprint Summary Engineering Report
Gobe Hobona (Editor), Joana Simoes (Editor)

Document number: 24-034
Document type: OGC Engineering Report
Document subtype: Implementation
Document stage: Draft
Document language: English

License Agreement

Use of this document is subject to the license agreement at https://www.ogc.org/license



I.  Executive Summary

The opportunities presented by Artificial Intelligence (AI) have led to rapid adoption of many AI-related technologies. At the same time, the rapid adoption of AI has highlighted the need for good quality training data to enable the systems to offer effective decision-making support. Data quality is therefore an enduring requirement for future geospatial technologies, just as it has always been for historic ones. Whereas some of those future geospatial technologies will likely rely on raster data, many others will rely on vector feature data. Therefore, validators capable of checking the validity of vector feature data files are likely to play a key role in an AI-driven future.

The focus of this Engineering Report (ER) is a code sprint that was held from July 10th to 12th, 2024 to advance the support and development of open standards within the developer community. The code sprint was organized by the Open Geospatial Consortium (OGC) and hosted by Geovation in London, England. The code sprint was sponsored by Google and supported by Natural Resources Canada (NRCan). The code sprint included activities involving several OGC API Standards and data encoding standards, as well as special tracks on Data Quality & Artificial Intelligence, Map Markup Language (MapML) and Validators.

The code sprint was held as a generic code sprint, meaning that all OGC working groups were encouraged to participate in the event. As a result, several OGC Standards Working Groups (SWGs) set up teams of developers to collaborate during the three-day event. In addition to providing software developers with an environment for collaborative coding and experimentation, the code sprint also provided opportunities for thought leadership through presentations and tutorials in the Mentor Stream. This made the code sprint a rich environment for knowledge transfer across teams, as well as for nurturing cross-functional teams.

The sprint participants made the following recommendations regarding future Collaborative Solutions and Innovation Program initiatives:

  • Initiative for urban Digital Twins

  • CDB2 experimentation in the context of Digital Twins

  • OGC API — 3D Geovolumes experimentation in the context of Digital Twins

  • Experimentation on consistency of metadata frameworks

  • An activity building on the ISO metadata activity

  • Experimentation on consistency of parameter and schema fragments in APIs

  • Prototyping and experimentation on OGC API — Features and Geocoding

  • Prototyping of an HTML MapML validator, possibly as a service. See https://github.com/Maps4HTML/validator-mapml for ideas.

The sprint participants made the following recommendations regarding future Standards Program initiatives:

  • Discussions on consistency of parameter and schema fragments in APIs

  • Discussion of the possibility of a TrainingDML-AI conformance class for Records

  • Discussion of the consistency of scale of OMS and APIs

  • Addition of a security element in future versions of TrainingDML-AI and other metadata encodings

II.  Keywords

The following are keywords to be used by search engines and document catalogues.

ogcdoc, OGC document, API, openapi, html, tdml-ai, mapml, json-fg

III.  Submitters

All questions regarding this document should be directed to the editors or the contributors:

Table — Submitters

Name | Organization | Role
Gobe Hobona | OGC | Editor
Joana Simoes | OGC | Editor
Tom Kralidis | OSGeo | Contributor
Chris Little | Met Office | Contributor
Frank Terpstra | Geonovum | Contributor
Maxime Collombin | University of Applied Sciences, Western Switzerland (HEIG-VD) | Contributor
Sam Meek | Helyx Secure Information Systems | Contributor
Rui Cavaco | Norte Portugal Regional Coordination and Development Commission | Contributor
TBA | TBA | Contributor
TBA | TBA | Contributor

IV.  Abstract

The subject of this Engineering Report (ER) is a code sprint that was held from July 10th to 12th, 2024 to advance the support and development of open standards within the developer community. The code sprint was organized by the Open Geospatial Consortium (OGC) and hosted by Geovation in London, England. The code sprint was sponsored by Google and supported by Natural Resources Canada (NRCan). The code sprint included activities involving several OGC API Standards and data encoding standards, as well as special tracks on Data Quality & Artificial Intelligence, Map Markup Language (MapML) and Validators.

1.  Introduction

OGC Code Sprints experiment with emerging ideas in the context of geospatial Standards and help improve interoperability of existing Standards by experimenting with new extensions or profiles. They are also used for building proofs-of-concept to support standards development activities and the enhancement of software products. The nature of the activities is influenced by whether a code sprint is ‘generic’ or ‘focused’. All OGC working groups are invited and encouraged to set up a thread in generic code sprints, whereas focused code sprints are tailored to a specific set of standards (typically limited to three standards).

This ER presents the high-level architecture of the code sprint and describes each of the standards and software packages that were deployed in support of the code sprint. The ER also discusses the results and presents a set of conclusions and recommendations. The recommendations identify ideas for future work, some of which may be more appropriate for testbeds, pilots, or other types of OGC initiatives. Therefore, the reader is encouraged to consider the recommended future work within the context of all OGC Standards development, collaborative solutions, and innovation activities.

2.  Terms, definitions and abbreviated terms

This document uses the terms defined in OGC Policy Directive 49, which is based on the ISO/IEC Directives, Part 2, Rules for the structure and drafting of International Standards. In particular, the word “shall” (not “must”) is the verb form used to indicate a requirement to be strictly followed to conform to this document and OGC documents do not use the equivalent phrases in the ISO/IEC Directives, Part 2.

This document also uses terms defined in the OGC Standard for Modular specifications (OGC 08-131r3), also known as the ‘ModSpec’. The definitions of terms such as standard, specification, requirement, and conformance test are provided in the ModSpec.

For the purposes of this document, the following additional terms and definitions apply.

An Application Programming Interface (API) is a standard set of documented and supported functions and procedures that expose the capabilities or data of an operating system, application, or service to other applications (adapted from ISO/IEC TR 13066-2:2016).

A Coordinate Reference System (CRS) is a coordinate system that is related to the real world by a datum (source: ISO 19111).

An OpenAPI definition (or OpenAPI document) is a document (or set of documents) that defines or describes an API. An OpenAPI definition uses and conforms to the OpenAPI Specification (https://www.openapis.org).

A Web API is an API using an architectural style that is founded on the technologies of the Web (source: OGC API — Features — Part 1: Core).

2.5.  Abbreviated terms

API

Application Programming Interface

CITE

Compliance Interoperability & Testing Evaluation

CRS

Coordinate Reference System

EDR

Environmental Data Retrieval

GIS

Geographic Information System

OGC

Open Geospatial Consortium

OWS

OGC Web Services

REST

Representational State Transfer

TEAM

Test, Evaluation, And Measurement Engine

3.  High-Level Architecture

The focus of the code sprint was on the support of implementations of open geospatial standards across various software projects. Implementations of approved and candidate OGC Standards were deployed in participants’ own infrastructure in order to build an environment with the architecture shown below in Figure 1. As illustrated, the sprint architecture was designed to enable client applications to connect to different servers that implement a variety of standards. The architecture also included several different software libraries that support open geospatial standards and enable the extraction, transformation, and loading of geospatial data.

Figure 1 — High Level Overview of the Sprint Architecture

The rest of this section describes the software deployed and the standards implemented during the code sprint.

3.1.  Approved OGC Standards

3.1.1.  OGC SensorThings API

The OGC SensorThings API Standard provides an open and harmonized way to interconnect devices, applications, and data over the web and on the Internet of Things (IoT) (OGC 18-088). At a high level the SensorThings API provides two main parts, namely Part I — Sensing, and Part II — Tasking. The Sensing part of the Standard provides a way to manage and retrieve observations and metadata from different sensor systems. The Tasking part of the Standard provides a way for tasking IoT devices, such as actuators and sensors. The SensorThings API follows REST principles and uses JSON for encoding messages as well as Message Queuing Telemetry Transport (MQTT) for publish/subscribe operations.

3.1.2.  OGC API — Features

The OGC API — Features Standard offers the capability to create, manage, and query spatial data on the Web. The Standard specifies requirements and recommendations for Web APIs that are designed to facilitate the sharing of feature data. The specification is a multi-part standard. Part 1, labelled the Core, describes the mandatory capabilities that every implementing service has to support and is restricted to read-access to spatial data that is referenced to the World Geodetic System 1984 (WGS 84) Coordinate Reference System (CRS) (OGC 17-069r4). Part 2 enables the use of different CRSs, in addition to WGS 84 (OGC 18-058r1). Additional capabilities that address specific needs will be specified in additional parts. Envisaged future capabilities include, for example, support for creating and modifying data, more complex data models, and richer queries.
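
As a rough illustration of the read-only access described for Part 1, the following Python sketch builds a request URL for the items of a feature collection. The server root `https://example.org/api` and the collection name `buildings` are placeholders and not endpoints used during the sprint; `bbox` and `limit` are query parameters defined in Part 1: Core.

```python
from urllib.parse import urlencode


def items_url(api_root, collection_id, bbox=None, limit=None):
    """Build a request URL for the 'items' resource of a feature collection."""
    params = {}
    if bbox is not None:
        # Part 1: Core expects minx,miny,maxx,maxy, by default in WGS 84 lon/lat.
        params["bbox"] = ",".join(str(v) for v in bbox)
    if limit is not None:
        params["limit"] = str(limit)
    url = f"{api_root.rstrip('/')}/collections/{collection_id}/items"
    # urlencode percent-encodes the commas of the bbox; servers decode them.
    return f"{url}?{urlencode(params)}" if params else url


if __name__ == "__main__":
    # Placeholder endpoint and collection, for illustration only.
    print(items_url("https://example.org/api", "buildings",
                    bbox=(-0.2, 51.4, 0.1, 51.6), limit=10))
```

The resulting URL can be passed to any HTTP client; the response format (for example GeoJSON or HTML) is negotiated separately.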

3.1.3.  OGC API — Tiles

OGC API — Tiles specifies a Standard for Web APIs that provide tiles of geospatial information (OGC 20-057). The Standard supports different forms of geospatial data, such as tiles of vector features (colloquially called “vector tiles”), coverages, and maps (or imagery), and may in future support additional types of tiles of geospatial data.

Vector data represents geospatial objects such as points, lines, and polygons. Tiles of vector feature data (i.e., ‘vector tiles’) represent partitions of vector data covering an area (e.g., lines representing rivers in a country).

In this context, a map is essentially an image representing at least one type of geospatial information. Tiles of maps (i.e., map tiles) represent subsets of maps covering an area.
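
To make the idea of a tile as a partition of an area more concrete, the sketch below computes which tile contains a given longitude/latitude and formats the corresponding tile path. It assumes the widely used WebMercatorQuad tile matrix set and the path template `/collections/{collectionId}/tiles/{tileMatrixSetId}/{tileMatrix}/{tileRow}/{tileCol}`; the collection name `rivers` is a placeholder.

```python
import math


def web_mercator_tile(lon, lat, zoom):
    """Return (row, col) of the WebMercatorQuad tile containing a lon/lat point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    col = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    row = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the far edges still map to a valid tile.
    return min(max(row, 0), n - 1), min(max(col, 0), n - 1)


def tile_path(collection_id, zoom, row, col, tile_matrix_set="WebMercatorQuad"):
    """Format the path of one tile of a collection (placeholder collection id)."""
    return f"/collections/{collection_id}/tiles/{tile_matrix_set}/{zoom}/{row}/{col}"


if __name__ == "__main__":
    row, col = web_mercator_tile(-0.1276, 51.5072, 10)  # central London
    print(tile_path("rivers", 10, row, col))
```

Other tile matrix sets use different grids, so the row/column arithmetic above applies to WebMercatorQuad only.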

3.1.4.  OGC API — Environmental Data Retrieval

The OGC API — Environmental Data Retrieval (EDR) Standard provides a family of lightweight interfaces to access Environmental Data resources. Each resource addressed by an EDR API maps to a defined query pattern. This Standard identifies resources, captures compliance classes, and specifies requirements which are applicable to OGC Environmental Data Retrieval APIs. This Standard addresses both discovery and query operations. Discovery operations enable the API to be interrogated to determine its capabilities and retrieve metadata about the published resource. Query operations allow Environmental Data resources to be retrieved from the underlying data store based upon simple selection criteria, defined by this Standard and selected by the client.

Version 1.1 of OGC API — EDR has been published (OGC 19-086r6). The EDR API Standards Working Group (SWG) has recently obtained approval to publish Part 2 of the Standard: “OGC API — Environmental Data Retrieval — Part 2: Publish-Subscribe workflow” ([bib_edrpart2]). The focus of the EDR API-related work in this code sprint is therefore on the use of Part 2 of the Standard. Work continues on defining improvements to Part 1: Core Version 1.1 to be known as Version 1.2.
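
The notion of a query pattern can be illustrated with the position query of Part 1, which retrieves data at a single point. The sketch below only assembles the request URL; the server root, the collection name `obs` and the parameter name `air_temperature` are placeholders chosen for illustration.

```python
from urllib.parse import urlencode


def edr_position_url(api_root, collection_id, lon, lat, parameter_names, datetime=None):
    """Build an EDR 'position' query URL for a point given as lon/lat."""
    params = {
        # The position query takes a WKT point in the 'coords' parameter.
        "coords": f"POINT({lon} {lat})",
        "parameter-name": ",".join(parameter_names),
    }
    if datetime:
        # A single instant or an interval such as 2024-07-10T00:00:00Z/2024-07-12T00:00:00Z
        params["datetime"] = datetime
    base = f"{api_root.rstrip('/')}/collections/{collection_id}/position"
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder endpoint, collection and parameter name.
    print(edr_position_url("https://example.org/edr", "obs", -0.13, 51.51,
                           ["air_temperature"]))
```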

3.1.5.  OGC API — Processes

The OGC API — Processes Standard supports the wrapping of computational tasks into executable processes that can be offered by a server through a Web API and be invoked by a client application (OGC 18-062r2). The standard enables the execution of computing processes and the retrieval of metadata describing the purpose and functionality of the processes. Typically, these processes execute well-defined algorithms that ingest vector and/or coverage data to produce new datasets.

OGC API — Processes — Part 2: Deploy, Replace, Undeploy (draft) extends the core capabilities specified in OGC API — Processes — Part 1: Core ([OGC_18-062r2]) with the ability to dynamically add, modify and/or delete individual processes using an implementation (endpoint) of the OGC API — Processes Standard.

3.2.  Candidate OGC Standards

3.2.2.  OGC API — Maps

The OGC API — Maps candidate Standard describes an API that can serve spatially referenced and dynamically rendered electronic maps ([bib_ogcapimaps]). The specification describes the discovery and query operations of an API that provides access to electronic maps in a manner independent of the underlying data store. The query operations allow dynamically rendered maps to be retrieved from the underlying data store based upon simple selection criteria as defined by the client.

3.2.3.  OGC API — Records

The OGC API — Records candidate Standard provides discovery and access to metadata records that describe resources such as features, coverages, tiles / maps, models, assets, datasets, services, or widgets ([bib_ogcapirecords]). The candidate Standard enables the discovery of geospatial resources by standardizing the way collections of descriptive information about the resources (metadata) are exposed. The candidate Standard also enables the discovery and sharing of related resources that may be referenced from geospatial resources or their metadata by standardizing the way all kinds of records are exposed and managed.

3.2.5.  OGC Features and Geometries JSON (JSON-FG)

The draft OGC Features and Geometries JSON (JSON-FG) Standard extends the GeoJSON format to support a limited set of additional capabilities that are out-of-scope for GeoJSON but that are important for a variety of use cases involving feature data ([bib_jsonfg]). In particular, the JSON-FG Standard specifies the following extensions to the GeoJSON format:

  • the ability to use Coordinate Reference Systems (CRSs) other than WGS 84;

  • support for solids and prisms as geometry types;

  • the ability to encode temporal characteristics of a feature; and

  • the ability to declare the type and the schema of a feature.
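
The extensions listed above can be sketched as a small Python helper that assembles a JSON-FG style feature. The member names (`featureType`, `coordRefSys`, `time`, `place`) follow the draft as understood at the time of writing and may change before publication; the feature values themselves are invented for illustration.

```python
import json


def make_jsonfg_feature(feature_id, feature_type, place, coord_ref_sys, timestamp, properties):
    """Assemble a feature that uses the JSON-FG extensions on top of GeoJSON."""
    return {
        "type": "Feature",
        "id": feature_id,
        "featureType": feature_type,        # declared type of the feature
        "coordRefSys": coord_ref_sys,       # CRS of the 'place' geometry
        "time": {"timestamp": timestamp},   # temporal characteristics
        "place": place,                     # geometry in a CRS other than WGS 84
        "geometry": None,                   # GeoJSON geometry stays WGS 84 or null
        "properties": properties,
    }


def to_json(feature):
    """Serialize the feature for exchange."""
    return json.dumps(feature, indent=2)


if __name__ == "__main__":
    feature = make_jsonfg_feature(
        "b1", "Building",
        {"type": "Point", "coordinates": [530000.0, 180000.0]},
        "http://www.opengis.net/def/crs/EPSG/0/27700",  # British National Grid
        "2024-07-10T09:00:00Z",
        {"name": "Example building"},
    )
    print(to_json(feature))
```

Keeping `geometry` as null while placing the projected coordinates in `place` lets a plain GeoJSON client read the feature without misinterpreting the coordinates as WGS 84.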

3.3.  Specifications from the Community

3.3.2.  ISO 19157-3

To facilitate dataset comparisons, evaluations and data quality reports (metadata or a quality evaluation report) have to be expressed in a comparable way, and it is necessary to have a common understanding of the data quality measures that have been used. An example of such a common understanding of a standard quality measure is defined in ISO 19157-1:2023 Geographic information — Data quality — Part 1: General requirements. This standard defines the structure of a data quality measure as well as all additional attributes describing a data quality measure (see the figure at https://github.com/i2vana/ISO19157-3/blob/main/OGC-CodeSprint/img/ISO19157-1_dataQualityMeasure.PNG).

To comply with current best practice for sharing data over the web, these measures have to reside in a machine-actionable data quality measures register. Such a register is currently under development at ISO/TC211 and OGC. ISO 19157-3 Geographic information — Data quality — Part 3: Data quality measures register will be the standard defining the components and content structure of a register for data quality measures, as well as the registration and maintenance procedure. All of this is in compliance with ISO 19135 Geographic information — Registration and registration procedures, a governing standard for all ISO/TC211 registers. The ISO 19157-3 register, and the measures it contains, will be hosted at OGC, the Registration Authority of ISO 19157-3. The current version, implemented in OGC RAINBOW, contains a first set of more than 80 recognized standard data quality measures (e.g., the Root Mean Square Error used for evaluating positional accuracy, or the Misclassification Matrix used to evaluate attribute accuracy), and these were used in the first few tests during the code sprint. The full version of the ISO 19157-3 data quality measures register is expected to be published together with the ISO 19157-3 standard in early 2026.
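
As an example of what one registered measure computes, the sketch below evaluates a horizontal Root Mean Square Error between measured and reference positions. It uses the common form, the square root of the mean squared planar distance; the formal definition in the register may differ in detail (for example, per-axis variants), so this is an illustration rather than the registered measure itself.

```python
import math


def positional_rmse(measured, reference):
    """Horizontal RMSE between paired (x, y) positions in the same projected CRS."""
    if not measured or len(measured) != len(reference):
        raise ValueError("measured and reference must be non-empty and of equal length")
    squared_distances = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(measured, reference)
    ]
    return math.sqrt(sum(squared_distances) / len(squared_distances))


if __name__ == "__main__":
    # Invented check points: surveyed reference positions and positions read from a dataset.
    reference = [(1000.0, 2000.0), (1500.0, 2500.0)]
    measured = [(1003.0, 2004.0), (1500.0, 2500.0)]
    print(positional_rmse(measured, reference))
```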

3.4.  Software Projects and Products

3.4.2.  QGIS

The QGIS stable release V3.28.6-Firenze, and later versions, have a Time Slider/Controller in the menu bar. It behaves like a video controller when the data being displayed has a temporal dimension.

This and later versions also support OGC API — EDR queries via a plugin.

Releases V3.37 and later support a Vertical Slider for data and layers that have a vertical extent.

Figure 2 — Screenshot of QGIS Time Controller

Figure 3 — Screenshot of QGIS EDR Plugin menu

3.4.3.  Status of OGCAPI SourceType support in QGIS

QGIS supports adding a Raster or Vector Layer using an OGC API Source Type. Under the hood, QGIS uses the OGCAPI GDAL driver. During this code sprint, this functionality was tested against different OGC APIs to determine what works and, where something fails, whether the issue lies in GDAL or in QGIS.

An issue with relative links, which affected all the APIs, was identified in GDAL. The issue has already been fixed and is described here.

The current status (using QGIS version 3.39.0-Master QGIS code revision 399f7df1c7 and GDAL/OGR version 3.10.0dev-126a88523a) is:

  • OGC API — Features is not working as expected, as described in this issue.

  • OGC API — Tiles (vector) is not working, as described in this issue.

  • OGC API — Tiles (raster) is working.

  • OGC API — Maps is working.

  • OGC API — Coverages is working.
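
Because QGIS delegates to the GDAL OGCAPI driver, the same endpoints can be tried directly from GDAL's Python bindings, which helps to separate GDAL issues from QGIS issues. The sketch below is a minimal example under two assumptions taken from the GDAL driver documentation as understood here: the `OGCAPI:` connection prefix and the `API` open option (`AUTO`, `MAP`, `TILES`, `COVERAGE`, `ITEMS`). The URL passed in would be a collection or landing page of the server under test.

```python
def ogcapi_connection_string(url):
    """Prefix a URL so that GDAL selects its OGCAPI driver."""
    return url if url.startswith("OGCAPI:") else f"OGCAPI:{url}"


def open_ogcapi(url, api="AUTO"):
    """Open an OGC API endpoint with GDAL; requires the GDAL Python bindings."""
    # Imported lazily so the helper above can be used without GDAL installed.
    from osgeo import gdal

    gdal.UseExceptions()
    # The 'API' open option is assumed from the driver documentation; it
    # selects which API (maps, tiles, coverage, items) is used for access.
    return gdal.OpenEx(ogcapi_connection_string(url), open_options=[f"API={api}"])
```

Opening the same collection once with `api="MAP"` and once with `api="TILES"` mirrors the per-API checks listed above.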

Figure 4 — Screenshot of an OGC API — Maps collection from Gnosis in QGIS

Figure 5 — Screenshot of an OGC API — Maps collection from pygeoapi in QGIS

Figure 6 — Screenshot of an OGC API — Tiles (raster) collection in QGIS

Figure 7 — Screenshot of an OGC API — Coverages collection in QGIS

More details about the setup for testing this functionality can be found in this issue.

3.4.5.  Geonovum JSON Linter

We took the existing JSON-FG linter and renamed it to the OGC-Checker. We added OGC API — Features — Part 1 functionality and a small portion of the OGC API — Common standard. The tool is now capable of automatically discovering which conformance classes are implemented and can apply the appropriate rulesets for validation on that basis. This allows us to add other OGC API standards in the future. With this, we’ve taken the first step toward making the linter/validator more generic. The results can be found at https://github.com/Geonovum-labs/ogc-checker and a live demo is available at https://geonovum-labs.github.io/ogc-checker/#/ogc-api. Testing the linter on examples of OGC API — Features uncovered several small issues, which were resolved by Clemens during the code sprint.
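
The conformance-driven selection of rulesets can be sketched as follows. This is not the OGC-Checker implementation; it is a Python illustration in which the ruleset names are hypothetical, while the conformance class URIs are those published for the Common and Features standards.

```python
# Hypothetical ruleset names keyed by published conformance class URIs.
CONFORMANCE_RULESETS = {
    "http://www.opengis.net/spec/ogcapi-common-1/1.0/conf/core": "common-core",
    "http://www.opengis.net/spec/ogcapi-features-1/1.0/conf/core": "features-core",
    "http://www.opengis.net/spec/ogcapi-features-1/1.0/conf/geojson": "features-geojson",
}


def select_rulesets(conformance_doc):
    """Pick the rulesets to apply from the 'conformsTo' list of a /conformance response."""
    selected = []
    for uri in conformance_doc.get("conformsTo", []):
        ruleset = CONFORMANCE_RULESETS.get(uri.rstrip("/"))
        # Unknown classes are skipped; duplicates are applied only once.
        if ruleset and ruleset not in selected:
            selected.append(ruleset)
    return selected


if __name__ == "__main__":
    declared = {
        "conformsTo": [
            "http://www.opengis.net/spec/ogcapi-features-1/1.0/conf/core",
            "http://www.opengis.net/spec/ogcapi-features-1/1.0/conf/oas30",
        ]
    }
    print(select_rulesets(declared))
```

Supporting a further standard then amounts to registering additional conformance class URIs with their rulesets.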

3.4.6.  OSGeo pygeoapi

pygeoapi is a Python server implementation of the OGC API suite of Standards. The project emerged as part of the next generation OGC API efforts in 2018 and provides the capability for organizations to deploy a RESTful OGC API endpoint using OpenAPI, GeoJSON, and HTML. pygeoapi is open source and released under an MIT license. pygeoapi is an official OSGeo Project as well as an OGC Reference Implementation. pygeoapi supports numerous OGC API Standards. The official documentation provides an overview of all supported standards.

3.4.7.  OSGeo pygeometa

pygeometa provides a lightweight and Pythonic approach for users to easily create geospatial metadata in standards-based formats using simple configuration files called Metadata Control Files (MCF). The software has minimal dependencies (the installation is less than 50 kB), and provides a flexible extension mechanism leveraging the Jinja2 templating system. Leveraging the simple but powerful YAML format, pygeometa can generate metadata in numerous standards. Users can also create their own custom metadata formats which can be plugged into pygeometa for custom metadata format output. pygeometa is open source and released under an MIT license.

For developers, pygeometa provides a Pythonic API that makes it possible to tightly couple metadata generation within their systems and to integrate it into metadata production pipelines.

The project supports various metadata formats out of the box including ISO 19115, the WMO Core Metadata Profile, and the WIGOS Metadata Standard. The project also supports the OGC API — Records core record model as well as STAC (Item).

3.4.8.  OSGeo OWSLib

OWSLib is a Python client for OGC Web Services and their related content models. The project is an OSGeo Community project and is released under a BSD 3-Clause License.

OWSLib supports numerous OGC standards, including increasing support for the OGC API suite of standards. The official documentation provides an overview of all supported standards.

3.4.9.  ldproxy

ldproxy is an implementation of the OGC API family of Standards, available under the MPL 2.0 open source license. ldproxy is developed by interactive instruments GmbH, written in Java, and is deployed using Docker containers. ldproxy implements all parts of OGC API — Features, OGC API — Tiles, OGC API — Styles, OGC API — 3D GeoVolumes, and OGC API — Routes. ldproxy is an OGC Reference Implementation for Parts 1 and 2 of OGC API — Features.

3.4.10.  CubeWerx Geospatial Data Server

The CubeWerx Geospatial Data Server (“cubeserv”) is implemented in C and currently implements the following OGC Standards and draft specifications.

  • Multiple conformance classes and recommendations of the OGC API — Tiles — Part 1: Core Standard

  • Multiple conformance classes and recommendations of the OGC API — Maps — Part 1: Core candidate Standard

  • All conformance classes and recommendations of the OGC API — Features — Part 1: Core Standard

  • Multiple conformance classes and recommendations of the OGC API — Records — Part 1: Core candidate Standard

  • Multiple conformance classes and recommendations of the OGC API — Coverages — Part 1: Core candidate Standard

  • Multiple conformance classes and recommendations of the OGC API — Processes — Part 1: Core Standard

  • Multiple versions of the Web Map Service (WMS), Web Processing Service (WPS), Web Map Tile Service (WMTS), and Web Feature Service (WFS) Standards

  • A number of other “un-adopted” OGC Web Service draft specifications including the Testbed-12 Web Integration Service, OWS-7 Engineering Report — GeoSynchronization Service, and the Web Object Service prototype

The cubeserv executable supports a wide variety of back ends including Oracle, MariaDB, Shapefiles, etc. It also supports a wide array of service-dependent output formats, for example, Geography Markup Language (GML), GeoJSON, Mapbox Vector Tiles, and MapML, as well as several coordinate reference systems.

3.4.11.  GNOSIS Map Server

The GNOSIS Map Server is written in the eC programming language and supports multiple OGC API Standards. GNOSIS Map Server supports multiple encodings including GNOSIS Map Tiles (which can contain either vector data, gridded coverages, imagery, point clouds, or 3D meshes), Mapbox Vector Tiles, GeoJSON, GeoTIFF, GML, and MapML. An experimental server is available online at https://maps.gnosis.earth/ogcapi and has been used in multiple OGC Innovation Program initiatives.

4.  Results

The code sprint included multiple software applications and experimented with several standards. This section presents the key results from the code sprint.

4.2.  Candidate OGC Standards

4.2.4.  OGC Styles and Symbology

During this code sprint, the sprint participants worked on drafting the OGC Styles & Symbology candidate standard. The work included:

  • explaining the SymCore concept to Louis-Martin from Bentley Systems and discussing possible use cases for hatch, stipple & pattern fills,

  • addition of a Metadata section in the Requirements Class “Core” (https://docs.ogc.org/DRAFTS/18-067r4.html#rc-core), with keywords and geoDataClasses parameters in line with the JSON schema,

  • discussion and clarification of the concepts of Stipples and Hatches,

  • drafting requirements with portrayal examples,

  • creating placeholders for the Abstract Test Suite,

  • discussion and clarification of viz.pass, feature.pass and zOrder concepts,

  • discussion of the consistency of UML class diagrams (see #74),

  • work on a future CartoSym transcoder,

  • discussion with Tom Kralidis on the possibility of implementing OGC API — Styles in pygeoapi based on the GetStyles operation of a WMS service,

  • discussion with Gobe Hobona on the relevance of using Enterprise Architect for UML modelling,

  • work on updating the SWG charter, clarifying the work items for Cartographic Symbology 2.0 parts 1-4, and proposing to rename the SWG to “Cartographic Symbology” to align with the name of the standard as well as the “CartoSym” encodings, and avoiding confusion with the scope of the OGC API — Styles SWG.

A number of GitHub issues were created as a result of this work. These included:

  • #24: Custom fill

  • #57: Add keywords element in Style node

  • #59: Rename hatchstyle to hatch and new properties

  • #62: Move Metadata in 1-core

  • #63: Complete full list of requirements for 1-core

  • #64: Rename stippleStyle to stipple

  • #71: Add support to allow arbitrary Expressions (not only literals) for all Symbolizer properties

  • #74: UML class diagrams

  • #76: Clarify viz.pass, feature.pass and zOrder concepts

4.2.4.1.  Conclusion

The code sprint enabled the editors of the OGC Cartographic Symbology candidate standard to work together on drafting the standard. They were able to clarify some concepts, requirements and use cases and moved the standard forward. A number of important tasks remain to be completed before CartoSym 2.0 — Part 1 can be finalized.

These tasks include:

  • detailing the requirements, including examples of rendering with the corresponding CartoSym-CSS encoding examples in the Requirements Classes and Appendix B: Mapping of SLD/SE and notable vendor extensions to the Conceptual Model,

  • updating the UML class diagrams (see #74),

  • writing Annex A: “Abstract Test Suite (Normative)”, as already started here,

  • implementing transcoders and rendering engines,

  • defining a global UML class hierarchy diagram covering all of the Requirements Classes. Enterprise Architect could be a useful tool for this. It would also make it possible to generate an XSD schema and a JSON schema against which the consistency of the existing schema could be checked. The UML to JSON conversion tool imvertor-lite could also be considered,

  • checking that the JSON schema complies with the candidate Best Practice for OGC — UML to JSON Encoding Rules.

4.3.  Specifications from the Community

4.3.1.  MapML

Several participants worked on the MapML track.

Rui Cavaco’s work (in person at the code sprint) followed three different ‘tracks’:

  • gaining a more complete understanding of the MapML concept by experimenting with the examples of the reference implementation polyfill;

  • experimenting with JavaScript, possibly extending some of the polyfill’s functionality;

  • understanding the polyfill’s support for projected CRSs such as ‘national grids’.

The first experiments made it clear that MapML itself, and the reference implementation polyfill, offer several capabilities that only become apparent when one goes beyond the simplest OSM tiling examples.

Three examples of this:

  • the map-extent element (and the possibility of having several for each layer);

  • the combination of the “projection” and “units” attributes;

  • the OSMTILE, CBMTILE and other keywords for “projection” or “units”.

The need to work through several examples to fully grasp MapML’s capabilities became very clear.

It also became quite clear that extending the polyfill’s functionality is not crucial, since such functionality is expected to be transferred, sooner or later, into browser code bases.

Rui Cavaco’s final work was therefore dedicated to JavaScript DOM manipulation, using one of MapML’s most interesting features: the ability to easily add dynamic changes to web maps. In this case, a real-world problem was addressed: adding to, and removing from, a web map, at the user’s request, some municipal-scale themes from the Northern Portugal region.

For this purpose, a web app was built using the MapML polyfill from Maps4HTML and the HoloViz Panel Python web framework.

Slides describing this work in more detail can be found here. The code repository can be found on GitHub here.

4.4.  Software Projects and Products

4.4.1.  OSGeo pygeoapi

4.4.1.1.  Mentor stream: Adding a new OGC API to pygeoapi

Given recent updates to pygeoapi in support of API implementation modularity, a mentor stream was given that focused on adding a new OGC API to pygeoapi. Developers were shown how to add and hook new API functionality and endpoints into the pygeoapi core, as well as how to expose them via pygeoapi’s OpenAPI functionality. A presentation and example application were demonstrated and made available as a proof of concept.

Figure 8

4.4.1.1.1.  OGC API — Processes — Part 2 implementation

An initial prototype was implemented to support the OGC API — Processes — Part 2: Deploy, Replace, Undeploy draft standard. Process creation was implemented by way of ingesting a Common Workflow Language (CWL) definition which referenced a Python application made available as a Docker image. The Python application implemented water body detection, which took as input Copernicus Sentinel-2 or USGS Landsat-9 data and detected water bodies by applying the Otsu thresholding technique on the Normalized Difference Water Index (NDWI). This application was made available as an example in support of the OGC Best Practice for Earth Observation Application Package.
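
The detection technique named above can be sketched in a few lines of plain Python. This is not the application deployed during the sprint; it is a minimal illustration that assumes per-pixel green and near-infrared reflectances, computes NDWI as (green - NIR) / (green + NIR), and picks the threshold that maximizes the between-class variance, which is the criterion of Otsu's method. The function names and the bin count are choices made for this sketch.

```python
def ndwi(green, nir):
    """Normalized Difference Water Index for one pixel."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total


def otsu_threshold(values, bins=64):
    """Return the threshold that maximizes the between-class variance."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo  # constant input: no meaningful split
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))

    best_threshold, best_score = lo, -1.0
    count_low, sum_low = 0, 0.0
    for i in range(bins - 1):
        count_low += hist[i]
        sum_low += centers[i] * hist[i]
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue
        mean_low = sum_low / count_low
        mean_high = (sum_all - sum_low) / count_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        score = count_low * count_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_score, best_threshold = score, lo + (i + 1) * width
    return best_threshold


def water_mask(pixels):
    """pixels: list of (green, nir) pairs; True marks pixels classified as water."""
    index = [ndwi(g, n) for g, n in pixels]
    threshold = otsu_threshold(index)
    return [value > threshold for value in index]
```

Water reflects little near-infrared light, so water pixels have a high NDWI and fall above the threshold. A production implementation would typically operate on raster arrays rather than Python lists.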

Figure 9

The CWL was published and made available as an OGC API Process. The process description included an executionUnit object providing the CWL definition.

Process execution then invoked the cwltool reference implementation to deploy and run the application in a portable manner, producing a manifest of a STAC Catalog of the NDWI outputs.

Figure 10

The architecture of the implementation is shown below, and the associated implementation can be found on GitHub. Future work includes process replace and delete functionality, as well as describing CWL inputs and outputs as part of the native process description model.

Figure 11

Special thanks are given to the Earth Observation Application Package resources and examples provided on GitHub, as well as to Gérald Fenoy of GeoLabs SARL for providing valuable explanations, advice and recommendations.

4.4.2.  OSGeo pygeometa

Following participation in the special track and breakout session focused on Data Quality and Artificial Intelligence, an initial Training Data Markup Language for AI (TrainingDML-AI) encoder was implemented in pygeometa. This enabled configuring a training dataset as a pygeometa metadata control file (MCF) configuration and generating TrainingDML-AI (or any other metadata format supported by pygeometa). Note that the MCF model was extended to support specifics of TrainingDML-AI.

The initial exercise illustrated a number of intersections between TrainingDML-AI and common metadata constructs (spatiotemporal extents, keywords, contacts, data quality, etc.) in support of discovery and documentation. As a result, an additional implementation was put forth to create a TrainingDML-AI profile of the OGC API — Records — Part 1: Core, Record Core model. This implementation mapped the common metadata constructs into the Record Core model, and encoded TrainingDML-AI specifics accordingly.

The result of this experiment was the encoding of a training dataset description as a metadata record that could then enable low barrier, broad interoperability via GeoJSON and OGC API — Records.
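The mapping idea can be sketched in plain Python without pygeometa. The example below is illustrative only: the record field names loosely follow the Record Core model and should be checked against the Standard, while the `trainingdml-ai` property and the input dictionary layout are hypothetical and do not reflect the pygeometa implementation.

```python
def bbox_to_polygon(west, south, east, north):
    """Convert a bounding box to a closed GeoJSON Polygon."""
    ring = [[west, south], [east, south], [east, north], [west, north], [west, south]]
    return {"type": "Polygon", "coordinates": [ring]}


def training_dataset_to_record(dataset):
    """Encode a training dataset description as a GeoJSON-based metadata record."""
    return {
        "id": dataset["id"],
        "type": "Feature",
        "geometry": bbox_to_polygon(*dataset["bbox"]),
        "time": {"interval": list(dataset["interval"])},
        "properties": {
            # Common metadata constructs shared with other metadata formats.
            "type": "dataset",
            "title": dataset["title"],
            "description": dataset["description"],
            "keywords": list(dataset["keywords"]),
            # Hypothetical container for training-data specifics.
            "trainingdml-ai": {
                "tasks": list(dataset["tasks"]),
                "classes": list(dataset["classes"]),
                "numberOfClasses": len(dataset["classes"]),
            },
        },
    }


record = training_dataset_to_record({
    "id": "example-training-dataset",
    "title": "Example building footprints training set",
    "description": "Illustrative record only.",
    "keywords": ["training data", "buildings"],
    "bbox": (-0.2, 51.4, 0.1, 51.6),
    "interval": ("2024-01-01", "2024-06-30"),
    "tasks": ["semantic segmentation"],
    "classes": ["building", "background"],
})
```

Because the result is an ordinary GeoJSON Feature, generic GeoJSON tooling can read it even when it does not understand the training-data specifics.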

Figure 12

The resulting implementation can be found on GitHub.

5.  Discussion

5.2.  OGC API Standards

This section discusses findings that apply to multiple OGC API Standards, as well as implications of those findings.

5.3.  Data Quality and Artificial Intelligence

Concepts around data quality have evolved over the past 20 or so years with the advent and proliferation of open source, non-authoritative data sources. Standards such as ISO 19115 Geospatial Metadata and ISO 19157 Geospatial Data Quality have been used and incorporated into metadata profiles to record the quality of geospatial datasets. One of the grounding concepts within the ISO standards is the universe of discourse, which translates only loosely to ‘ground truth’ or ‘real world’. In the early days of the ISO data quality concepts, the universe of discourse was much simpler to determine because authoritative datasets were routinely ground truthed for accuracy. A salient example of this is aerial photography, where flight paths were planned and a ground crew would lay out tie points on the ground to be captured by the imagery. Distances could then be physically measured to derive an overall positional accuracy for the image. In the modern world of non-authoritative, crowdsourced or open source data, the universe of discourse concept does not apply in the same way. The data quality model reports on concepts such as:

  • Completeness — the number or percentage of features that appear within the dataset compared to the universe of discourse.

  • Thematic Accuracy — the number or percentage of misclassified features.

  • Logical Consistency — the number of missing links in a road dataset.

  • Conceptual Consistency — number of overlapping surfaces in a dataset.
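Two of the measures listed above can be computed with a few lines of Python. The sketch below assumes that features carry identifiers that can be compared with a reference set standing in for the universe of discourse. The function names are illustrative and are not taken from ISO 19157.

```python
def completeness_percent(dataset_ids, reference_ids):
    """Percentage of reference features that are present in the dataset."""
    reference = set(reference_ids)
    if not reference:
        raise ValueError("reference set must not be empty")
    present = reference & set(dataset_ids)
    return 100.0 * len(present) / len(reference)


def misclassification_percent(assigned, expected):
    """Percentage of features whose assigned class differs from the expected class."""
    if not expected or len(assigned) != len(expected):
        raise ValueError("class lists must be non-empty and of equal length")
    wrong = sum(1 for a, e in zip(assigned, expected) if a != e)
    return 100.0 * wrong / len(expected)


completeness = completeness_percent(["a", "b", "c"], ["a", "b", "c", "d"])
thematic_error = misclassification_percent(
    ["road", "river", "road", "building"],
    ["road", "river", "building", "building"],
)
```

Both functions depend on a trusted reference, which is exactly what is hard to obtain for crowdsourced data.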

A key element of data quality is a record of how the dataset has been processed including information about the computational manipulation as well as individuals and organizations who have completed the processing. In ISO 19115, this is described in the lineage classes, and specifically through the LE_Processing Class. This class enables recording of the specific processing and runtime parameters required to create a data quality metric and recreate the lineage if necessary.

The OGC has been exploring the use of reusable building blocks to enable elements of standards to be re-used, extended, and modified in a FAIR (Findable, Accessible, Interoperable, Reusable) way. The objective of the data quality exploration work in this code sprint was to apply these concepts to address the following questions:

  1. Can we create machine readable provenance chains to enable scientific data re-creation?

  2. Can machine readable provenance chains be automatable, and executable?

  3. Can we use the OGC building block RAINBOW server to store and refer to mathematical formulae?

  4. Can the formulae be parsed into machine readable formats, new variables injected, and the provenance chain executed?

5.3.1.  Approach

One of the major focuses of the code sprint was the use, testing, and enhancement of Training Data Markup Language (TrainingDML). Therefore, this was used as the target standard to test the approach. The example is a simple, reduced, and modified version of the linked dataset. It is shown below:

{
  "type": "DataQuality",
  "scope": {
    "level": "dataset",
    "levelDescription": [
      {
        "dataset": "main.bld_fts_building"
      }
    ]
  },
  "report": [
    {
      "type": "PositionalAccuracy",
      "measure": {
        "measureIdentification": {
          "code": "FT28_2",
          "authority": "https://defs-dev.opengis.net/vocprez-hosted/object?uri=https%3A//standards.isotc211.org/19157/-3/1/dqc/content/formulaType/"
        },
        "nameOfMeasure": [
          "Absolute Value of mean error to the standard deviation",
          "Horizontal"
        ],
        "measureDescription": "Calculates the mean error of the standard deviation"
      },
      "evaluationMethod": {
        "evaluationMethodDescription": "Uses the standard mathematical formula"
      },
      "result": [
        {
          "quantitativeResult": {
            "value": [
              0.75643
            ],
            "valueUnit": "real"
          }
        }
      ]
    }
  ]
}

Listing 1
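A consumer of such a document first needs to locate the measure code and authority for each report entry. The following sketch shows one way to do that with the Python standard library. The embedded JSON is a trimmed copy of the listing, and the function name and the shape of the returned summary are assumptions made for illustration.

```python
import json

REPORT_JSON = """
{
  "type": "DataQuality",
  "report": [
    {
      "type": "PositionalAccuracy",
      "measure": {
        "measureIdentification": {
          "code": "FT28_2",
          "authority": "https://defs-dev.opengis.net/vocprez-hosted/object?uri=https%3A//standards.isotc211.org/19157/-3/1/dqc/content/formulaType/"
        }
      },
      "result": [
        {"quantitativeResult": {"value": [0.75643], "valueUnit": "real"}}
      ]
    }
  ]
}
"""


def summarize_measures(data_quality):
    """Return one summary per report entry: type, measure code, authority, values."""
    summaries = []
    for report in data_quality.get("report", []):
        identification = report["measure"]["measureIdentification"]
        values = [
            value
            for result in report.get("result", [])
            for value in result.get("quantitativeResult", {}).get("value", [])
        ]
        summaries.append({
            "type": report["type"],
            "code": identification["code"],
            "authority": identification["authority"],
            "values": values,
        })
    return summaries


summaries = summarize_measures(json.loads(REPORT_JSON))
```

The `authority` and `code` pair is the hook that lets software look up the formula definition behind a reported value.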

The test dataset contains the PositionalAccuracy element, which includes the measureIdentification element. That element in turn holds a reference to the authority and a code that corresponds to the data quality formula expressed in MathML. The objective of the software developed in this code sprint was to parse the MathML so that the formula would be machine readable and machine executable. The work succeeded in making the formula machine readable, but it was not possible to inject new variables into the formula in order to execute the chain for another dataset.
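To make the goal of variable injection concrete, the sketch below evaluates a small Content MathML expression with values supplied at run time. It is not the software developed during the code sprint. The formula is a hypothetical stand-in rather than the actual definition behind the FT28_2 code, and only a handful of operators are supported.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical Content MathML expression: |mean_error| / std_dev
FORMULA = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply><divide/>
    <apply><abs/><ci>mean_error</ci></apply>
    <ci>std_dev</ci>
  </apply>
</math>
"""

OPERATORS = {
    "plus": lambda *args: sum(args),
    "times": lambda *args: math.prod(args),
    "minus": lambda a, b=None: -a if b is None else a - b,
    "divide": lambda a, b: a / b,
    "power": lambda a, b: a ** b,
    "abs": lambda a: abs(a),
    "root": lambda a: math.sqrt(a),   # square root only, no degree support
}


def local_name(tag):
    """Strip the XML namespace from an element tag."""
    return tag.rsplit("}", 1)[-1]


def evaluate(node, variables):
    """Recursively evaluate a Content MathML element with injected variables."""
    name = local_name(node.tag)
    if name == "math":
        (child,) = list(node)          # expect exactly one top-level expression
        return evaluate(child, variables)
    if name == "cn":
        return float(node.text)
    if name == "ci":
        return float(variables[node.text.strip()])
    if name == "apply":
        operator, *operands = list(node)
        function = OPERATORS[local_name(operator.tag)]
        return function(*(evaluate(operand, variables) for operand in operands))
    raise ValueError(f"unsupported MathML element: {name}")


result = evaluate(ET.fromstring(FORMULA), {"mean_error": -1.5, "std_dev": 2.0})
```

Running the same parsed formula with a different `variables` dictionary is the kind of re-execution for another dataset that the sprint identified as still missing.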

5.3.1.1.  Recommendations & Next Steps

The use cases for machine readable and executable provenance chains should be widened to include a specific ML training data use case. Sampling, correcting, and training on large datasets should follow a reproducible method so that the resulting models can be recreated. The recommendations are as follows:

  • Run a code sprint focused on machine readable provenance chains, model reproducibility, and recording data quality.

  • Update the RAINBOW server to offer executable code and variable injection to ensure the chains are machine readable.

  • Test the process with several different datasets with a focus on model reproducibility.

  • Provide feedback to ISO on the findings.

5.4.  Relationship between OGC SensorThings API and OGC API — Connected Systems

The code sprint hosted a discussion on the relationship between the OGC SensorThings API Standard and the OGC API — Connected Systems candidate Standard. The purpose of the discussion was to determine the commonalities and differences between the OGC SensorThings API Standard and OGC API – Connected Systems candidate Standard so that a consistent message can be presented to the market. Early in the session, it was observed that collaboration between the SWGs could be improved. It is envisaged therefore that the discussion summarised in this section will help to improve collaboration between the SWGs and consistency in messaging.

The first activity during the session was to review the table of design choices presented in Annex B.9 of OGC API — Connected Systems — Part 1: Feature Resources. The table presents a comparison of the design choices made in the development of the two standards. The participants noted that the OGC SensorThings API specifies support for ‘any’ type of observation and thus does not identify a set list of supported observation types. In contrast, OGC API — Connected Systems lists the observation types that the standard supports, namely scalar, vector, N-D coverage, and video.

Table 1 — Comparison of design choices made in OGC API — Connected Systems and OGC SensorThings API (reproduced from a draft version of OGC 23-001 on 19 July 2024)

Design Choice | Connected Systems | SensorThings
API Platform | Extension of OGC API — Common and OGC API — Features. | OData Version 4.0.
Query Language | Query string arguments, decoupled from resource encoding. | Generic query language inherited from OData.
Resource Model | Based on SOSA/SSN/OMS and SensorML. | Simplified and adapted from O&M.
Supported Observation Types | Scalar, vector, N-D coverage, video. | Scalar and simple records only.
Multiple Format Support | Yes, including non-JSON such as Protocol Buffers or other binary formats. | OData compatible JSON only.
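The difference in query language can be illustrated by building one request URL in each style. In the sketch below, `$filter` and `$top` are standard OData query options used by the SensorThings API. The `bbox` and `limit` parameters, the `systems` resource path, the `v1.1` path segment, and the example hosts are assumptions based on common OGC API conventions and typical deployments.

```python
from urllib.parse import quote, urlencode


def sensorthings_url(base, entity_set, filter_expr=None, top=None):
    """Build an OData-style SensorThings request URL."""
    params = {}
    if filter_expr:
        params["$filter"] = filter_expr
    if top is not None:
        params["$top"] = top
    query = urlencode(params, safe="$", quote_via=quote)
    return f"{base}/v1.1/{entity_set}" + (f"?{query}" if query else "")


def connected_systems_url(base, resource, bbox=None, limit=None):
    """Build a request URL that uses plain query string arguments."""
    params = {}
    if bbox is not None:
        params["bbox"] = ",".join(str(value) for value in bbox)
    if limit is not None:
        params["limit"] = limit
    query = urlencode(params, safe=",", quote_via=quote)
    return f"{base}/{resource}" + (f"?{query}" if query else "")


sta = sensorthings_url("https://example.org/sta", "Observations", "result gt 5", 10)
cs = connected_systems_url("https://example.org/api", "systems", (-1.0, 50.0, 1.0, 52.0), 10)
```

The first URL embeds a generic expression language in a single parameter, whereas the second exposes each filter as its own named argument.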

The participants noted that, rather than focusing on how the standards implement certain capabilities, the discussion should focus on the use cases that are addressed by the standards. There was agreement that use cases would be more appropriate for helping to establish a consistent message to the market. It was noted that several technical and domain use cases are documented in the Reviewers’ Guide of OGC API — Connected Systems. At the time of the code sprint, the Reviewers’ Guide offered 25 use cases.

It was observed that the ability to identify, describe in detail, and integrate different sensor systems is one of the motivations for OGC API — Connected Systems and thus a key differentiator from the OGC SensorThings API. That is, as the name suggests, OGC API — Connected Systems is designed to support the deployment and connection of various systems, some of which are likely to include sensors. The domain use cases presented in the Reviewers’ Guide include use cases that highlight this ability to connect different systems.

The key differentiators can therefore be summarised as follows:

  • The level of complexity of the system(s) and what you need to describe them

  • How you can query that complexity

  • Connectivity between the systems

Taking the example of a satellite, the participants explained that an application that required knowledge of where a satellite was, how the satellite was operated, and the resulting observational data would require the capabilities offered by OGC API — Connected Systems. In contrast, an application that only required the resulting observational data and not details of the system collecting the data could be sufficiently supported by an implementation of the OGC SensorThings API.

6.  Conclusions

TBA


Annex A
(informative)
Revision History

Table — Revision History

Date | Release | Author | Primary clauses modified | Description
2024-07-11 | 0.1 | G. Hobona | all | initial version
2024-07-14 | 0.2 | T. Kralidis | all | OSGeo updates

Bibliography

[1]  Gobe Hobona, Joana Simoes, Tom Kralidis, Martin Desruisseaux, Angelos Tzotsos: OGC 23-025, 2023 Open Standards and Open Source Software Code Sprint Summary Engineering Report. Open Geospatial Consortium (2023). http://www.opengis.net/doc/PER/ogc-osgeo-asf-codesprint2023.

[2]  Carl Reed: OGC 15-113r6, Volume 1: OGC CDB Core Standard: Model and Physical Data Store Structure. Open Geospatial Consortium (2021). http://www.opengis.net/doc/IS/CDB-core/1.2.0.

[3]  Mark Burgoyne, David Blodgett, Charles Heazel, Chris Little: OGC 19-086r6, OGC API — Environmental Data Retrieval Standard. Open Geospatial Consortium (2023). http://www.opengis.net/doc/IS/ogcapi-edr-1/1.1.0.

[4]  Clemens Portele, Panagiotis (Peter) A. Vretanos, Charles Heazel: OGC 17-069r4, OGC API — Features — Part 1: Core corrigendum. Open Geospatial Consortium (2022). http://www.opengis.net/doc/IS/ogcapi-features-1/1.0.1.

[5]  Clemens Portele, Panagiotis (Peter) A. Vretanos: OGC 18-058r1, OGC API — Features — Part 2: Coordinate Reference Systems by Reference corrigendum. Open Geospatial Consortium (2022). http://www.opengis.net/doc/IS/ogcapi-features-2/1.0.1.

[6]  Benjamin Pross, Panagiotis (Peter) A. Vretanos: OGC 18-062r2, OGC API — Processes — Part 1: Core. Open Geospatial Consortium (2021). http://www.opengis.net/doc/IS/ogcapi-processes-1/1.0.0.

[7]  Joan Masó, Jérôme Jacovella-St-Louis: OGC 20-057, OGC API — Tiles — Part 1: Core. Open Geospatial Consortium (2022). http://www.opengis.net/doc/IS/ogcapi-tiles-1/1.0.0.

[8]  Tatjana Kutzner, Carl Stephen Smyth, Claus Nagel, Volker Coors, Diego Vinasco-Alvarez, Nobuhiro Ishi: OGC 21-006r2, OGC City Geography Markup Language (CityGML) Part 2: GML Encoding Standard. Open Geospatial Consortium (2023). http://www.opengis.net/doc/IS/CityGML-2/3.0.0.

[9]  Andreas Matheus: OGC 22-022r1, OGC SensorThings API Extension: STAplus 1.0. Open Geospatial Consortium (2023). http://www.opengis.net/doc/is/sensorthings-staplus/1.0.0.

[10]  Steve Liang, Tania Khalafbeigi, Hylke van der Schaaf: OGC 18-088, OGC SensorThings API Part 1: Sensing Version 1.1. Open Geospatial Consortium (2021). http://www.opengis.net/doc/is/sensorthings/1.1.0.

[11]  Lucio Colaiacomo, Joan Masó, Emmanuel Devys, Eric Hirschorn: OGC 08-085r8, OGC® GML in JPEG 2000 (GMLJP2) Encoding Standard. Open Geospatial Consortium (2018). http://www.opengis.net/doc/IS/GMLJP2/2.1.0.

[12]  Jeff Yutzler: OGC 12-128r18, OGC® GeoPackage Encoding Standard. Open Geospatial Consortium (2021). http://www.opengis.net/doc/IS/geopackage/1.3.1.

[13]  OGC API — Coverages — Part 1: Core (draft), Open Geospatial Consortium. https://docs.ogc.org/DRAFTS/19-087.html